Improved detection of DNA motifs using a self-organized clustering of familial binding profiles

Authors

  • Shaun Mahony
  • Aaron Golden
  • Terry J. Smith
  • Panayiotis V. Benos
Abstract

MOTIVATION: One of the limiting factors in deciphering transcriptional regulatory networks is the effectiveness of motif-finding software. An emerging avenue for improving motif-finding accuracy is to incorporate the generalized binding constraints of related transcription factors (TFs), termed familial binding profiles (FBPs), as priors in motif identification methods. A motif-finder can thus be 'biased' towards finding motifs from a particular TF family. However, current motif-finders allow only a single FBP to be used as a prior in a given motif-finding run. In addition, current FBP construction methods rely on manual clustering of position-specific scoring matrices (PSSMs) according to the known structural properties of the TF proteins. Manual clustering assumes that structurally similar TFs will also have similar binding preferences. This assumption does not hold for at least some TF families. Automatic PSSM clustering methods are therefore required to make FBPs more useful.

RESULTS: A novel method is developed for automatic clustering of PSSM models. The resulting FBPs are incorporated into the SOMBRERO motif-finder, significantly improving its performance when finding motifs related to those that have been incorporated. SOMBRERO is thus the only existing de novo motif-finder that can incorporate knowledge of all known PSSMs in a given motif-finding run.

AVAILABILITY: The methods outlined will be incorporated into the next release of SOMBRERO, which is available from http://bioinf.nuigalway.ie/sombrero
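
As a rough illustration of what a familial binding profile is, the sketch below compares two aligned PSSMs and averages them into a single generalized profile. It is not the clustering method of the paper, whose details are not given in this abstract. The toy matrices, the column-wise Pearson correlation score, and the function names are all assumptions chosen for illustration, and the PSSMs are assumed to be already aligned and of equal length.

```python
from math import sqrt

# Hypothetical toy PSSMs, invented for illustration (not from the paper).
# Each row is one motif position: probabilities for A, C, G, T.
PSSM_A = [
    [0.7, 0.1, 0.1, 0.1],
    [0.1, 0.1, 0.7, 0.1],
    [0.1, 0.7, 0.1, 0.1],
]
PSSM_B = [
    [0.6, 0.2, 0.1, 0.1],
    [0.1, 0.1, 0.6, 0.2],
    [0.2, 0.6, 0.1, 0.1],
]


def column_correlation(col_x, col_y):
    """Pearson correlation between two 4-element PSSM columns."""
    mean_x = sum(col_x) / len(col_x)
    mean_y = sum(col_y) / len(col_y)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(col_x, col_y))
    var_x = sum((x - mean_x) ** 2 for x in col_x)
    var_y = sum((y - mean_y) ** 2 for y in col_y)
    if var_x == 0 or var_y == 0:
        return 0.0  # a uniform column has no base preference to correlate
    return cov / sqrt(var_x * var_y)


def pssm_similarity(pssm_x, pssm_y):
    """Mean column-wise correlation of two aligned, equal-length PSSMs."""
    if len(pssm_x) != len(pssm_y):
        raise ValueError("PSSMs must be aligned to the same length")
    scores = [column_correlation(cx, cy) for cx, cy in zip(pssm_x, pssm_y)]
    return sum(scores) / len(scores)


def average_profile(pssms):
    """Column-wise average of aligned PSSMs: a simple stand-in for an FBP."""
    width = len(pssms[0])
    if any(len(p) != width for p in pssms):
        raise ValueError("PSSMs must be aligned to the same length")
    return [
        [sum(p[pos][base] for p in pssms) / len(pssms) for base in range(4)]
        for pos in range(width)
    ]


similarity = pssm_similarity(PSSM_A, PSSM_B)
fbp = average_profile([PSSM_A, PSSM_B])
```

In an automatic clustering setting, a similarity score of this kind would decide which PSSMs are grouped before averaging, in place of grouping by the structural class of the TF protein.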

Similar resources

MotifOrganizer: a scalable model-based motif clustering tool for mammalian genomes.

Assembling a comprehensive catalog of all transcription factors (TFs) and the genes that they regulate (regulon) is important for understanding gene regulation. The sequence-specific conserved binding profiles of TFs can be characterized from whole genome sequences with phylogenetic approaches, and a large number of such profiles have been released. Effective mining of these data sources could ...

A robust wavelet based profile monitoring and change point detection using S-estimator and clustering

Some quality characteristics are well defined when treated as response variables and are related to some independent variables. This relationship is called a profile. Parametric models, such as linear models, may be used to model profiles. However, in practical applications, due to the complexity of many processes, it is not usually possible to model a process using parametric models. In these cas...

DNA Familial Binding Profiles Made Easy: Comparison of Various Motif Alignment and Clustering Strategies

Transcription factor (TF) proteins recognize a small number of DNA sequences with high specificity and control the expression of neighbouring genes. The evolution of TF binding preference has been the subject of a number of recent studies, in which generalized binding profiles have been introduced and used to improve the prediction of new target sites. Generalized profiles are generated by alig...

Development of a Sensitive Quantitative Competitive PCR Assay for Detection of Human Cytomegalovirus DNA

Accurate and rapid diagnosis of human cytomegalovirus (HCMV) disease in immunocompromised patients has remained a challenge. Quantitative competitive PCR (QC-PCR) methods for detection of HCMV in these individuals have improved the positive and negative predictive values of PCR for diagnosis of HCMV disease. In this study we used a QC-PCR assay, using a co-amplified DNA standard, to quantitate...

DNA Fingerprinting Based on Repetitive Sequences of Iranian Indigenous Lactobacilli Species by (GTG)5-REP-PCR

Background and Objective: The use of lactobacilli as probiotics requires the application of accurate and reliable methods for the detection and identification of bacteria at the strain level. Repetitive sequence-based polymerase chain reaction (rep-PCR), a DNA fingerprinting technique, has been successfully used as a powerful molecular typing method to determine taxonomic and phylogenetic relat...

Journal title:
  • Bioinformatics

Volume 21 Suppl 1  Issue 

Pages  -

Publication date 2005